A Bayesian Network Model for Interesting Itemsets

نویسندگان

  • Jaroslav M. Fowkes
  • Charles A. Sutton
چکیده

Mining itemsets that are the most interesting under a statistical model of the underlying data is a commonly used and well-studied technique for exploratory data analysis, with the most recent interestingness models exhibiting state of the art performance. Continuing this highly promising line of work, we propose the first, to the best of our knowledge, generative model over itemsets, in the form of a Bayesian network, and an associated novel measure of interestingness. Our model is able to efficiently infer interesting itemsets directly from the transaction database using structural EM, in which the E-step employs the greedy approximation to weighted set cover. Our approach is theoretically simple, straightforward to implement, trivially parallelizable and retrieves itemsets whose quality is comparable to, if not better than, existing state of the art algorithms as we demonstrate on several real-world datasets.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Classification as Mining and Use of Labeled Itemsets

We investigate the relationship between association and classification mining. The main issue in association mining is the discovery of interesting patterns of the data, so called itemsets. We introduce the notion of labeled itemsets and derive the surprising result that classification techniques such as decision trees, Naïve Bayes, Bayesian networks and Nearest Neighbor classifiers can all be ...

متن کامل

Risk Analysis of Operating Room Using the Fuzzy Bayesian Network Model

To enhance Patient’s safety, we need effective methods for risk management. This work aims to propose an integrated approach to risk management for a hospital system. To improve patient’s safety, we should develop flexible methods where different aspects of risk and type of information are taken into consideration. This paper proposes a fuzzy Bayesian network to model and analyze risk in the op...

متن کامل

An efficient Bayesian network approach for discovering interesting patterns

The main problem faced by all association rule/pattern mining algorithms is their production of a large number of rules which incurred a secondary mining problem; namely, mining interesting association rules/patterns. The problem is compounded by the fact that ‘common knowledge’ discovered rules are not interesting, but they are usually strong rules with high support and confidence levels – the...

متن کامل

A Probabilistic Model for COPD Diagnosis and Phenotyping Using Bayesian Networks

Introduction: This research was meant to provide a model for COPD diagnosis and to classify the cases into phenotypes; General COPD, Chronic bronchitis, Emphysema, and the Asthmatic COPD using a Bayesian Network (BN). Methods: The model was constructed through developing the Bayesian Network structure and instantiating the parameters for each of the variables. In order to validate the achiev...

متن کامل

A Surface Water Evaporation Estimation Model Using Bayesian Belief Networks with an Application to the Persian Gulf

Evaporation phenomena is a effective climate component on water resources management and has special importance in agriculture. In this paper, Bayesian belief networks (BBNs) as a non-linear modeling technique provide an evaporation estimation  method under uncertainty. As a case study, we estimated the surface water evaporation of the Persian Gulf and worked with a dataset of observations ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016